139 research outputs found

    Measurement Error in Lasso: Impact and Correction

    Regression with the lasso penalty is a popular tool for performing dimension reduction when the number of covariates is large. In many applications of the lasso, such as in genomics, covariates are subject to measurement error. We study the impact of measurement error on linear regression with the lasso penalty, both analytically and in simulation experiments. A simple method of correction for measurement error in the lasso is then considered. In the large sample limit, the corrected lasso yields sign consistent covariate selection under conditions very similar to the lasso with perfect measurements, whereas the uncorrected lasso requires much more stringent conditions on the covariance structure of the data. Finally, we suggest methods to correct for measurement error in generalized linear models with the lasso penalty, which we study empirically in simulation experiments with logistic regression, and also apply to a classification problem with microarray data. We see that the corrected lasso selects fewer false positives than the standard lasso, at a similar level of true positives. The corrected lasso can therefore be used to obtain more conservative covariate selection in genomic analysis.

    Longitudinal modeling of age-dependent latent traits with generalized additive latent and mixed models

    We present generalized additive latent and mixed models (GALAMMs) for analysis of clustered data with responses and latent variables depending smoothly on observed variables. A scalable maximum likelihood estimation algorithm is proposed, utilizing the Laplace approximation, sparse matrix computation, and automatic differentiation. Mixed response types, heteroscedasticity, and crossed random effects are naturally incorporated into the framework. The models were motivated by applications in cognitive neuroscience, and two case studies are presented. First, we show how GALAMMs can jointly model the complex lifespan trajectories of episodic memory, working memory, and speed/executive function, measured by the California Verbal Learning Test (CVLT), digit span tests, and Stroop tests, respectively. Next, we study the effect of socioeconomic status on brain structure, using data on education and income together with hippocampal volumes estimated by magnetic resonance imaging. By combining semiparametric estimation with latent variable modeling, GALAMMs allow a more realistic representation of how brain and cognition vary across the lifespan, while simultaneously estimating latent traits from measured items. Simulation experiments suggest that model estimates are accurate even with moderate sample sizes.

    Developing the HLS19-YP12 for measuring health literacy in young people: a latent trait analysis using Rasch modelling and confirmatory factor analysis

    Background: Accurate and precise measures of health literacy (HL) support health policy making, tailoring of health service design, and equitable access to health services. According to research, valid and reliable unidimensional HL measurement instruments explicitly targeted at young people (YP) are scarce. Thus, this study aims to assess the psychometric properties of existing unidimensional instruments and to develop an HL instrument suitable for YP aged 16–25 years. Methods: Applying the HLS19-Q47 in computer-assisted telephone interviews, we collected data in a representative sample comprising 890 YP aged 16–25 years in Norway. Applying the partial credit parameterization of the unidimensional Rasch model for polytomous data (PCM) and confirmatory factor analysis (CFA) with categorical variables, we evaluated the psychometric properties of the short versions of the HLS19-Q47: HLS19-Q12, HLS19-SF12, and HLS19-Q12-NO. A new 12-item short version for measuring HL in YP, HLS19-YP12, is suggested. Results: The HLS19-Q12 did not display sufficient fit to the PCM, and the HLS19-SF12 was not sufficiently unidimensional. Relative to the PCM, some items in the HLS19-Q12, the HLS19-SF12, and the HLS19-Q12-NO discriminated poorly between participants at high and at low locations on the underlying latent trait. We observed disordered response categories for some items in the HLS19-Q12 and the HLS19-SF12. A few items in the HLS19-Q12, the HLS19-SF12, and the HLS19-Q12-NO displayed either uniform or non-uniform differential item functioning. Applying one-factorial CFA, none of the aforementioned short versions achieved exact fit in terms of a non-significant model chi-square statistic, or approximate fit in terms of SRMR ≤ .080 and all entries ≤ .10 in the respective residual matrix. The newly suggested parsimonious 12-item scale, HLS19-YP12, displayed sufficient fit to the PCM and achieved approximate fit using one-factorial CFA.
Conclusions: Compared to other parsimonious 12-item short versions of HLS19-Q47, the HLS19-YP12 has superior psychometric properties and unconditionally proved its unidimensionality. The HLS19-YP12 offers an efficient and much-needed screening tool for use among YP, which is likely to be useful in processes towards the development and evaluation of health policy and public health work, as well as in clinical settings.

    A study of the potential for exploiting thermoelectric energy conversion on the ULA class. Increased striking power for the Norwegian submarines by means of thermoelectricity.

    This study investigated whether it is possible, and whether it is worthwhile, to exploit the waste heat from the diesel generator on the Norwegian ULA-class submarines by means of a thermoelectric generator (TEG). A thermoelectric element converts heat into electrical energy where a temperature difference exists, and the system has no moving parts. The attached seawater cooling system is intended to reduce noise and thermal signature, and waste heat arises as a result. Because thermoelectricity is not on the syllabus and is not taught at the Royal Norwegian Naval Academy (Sjøkrigsskolen), the thesis includes a thorough account of what thermoelectricity is. To answer the research question, we first examined whether it is possible to exploit the waste heat by means of thermoelectricity. We then examined whether such an installation is worthwhile. To assess this, we first considered an optimal solution and then evaluated it on two aspects: cost and strategic capacity. The results showed that installing a thermoelectric generator would have the following effects: the thermal signature increases by 34 °C, and power output increases by approximately 10 kW. The added power extends the range per patrol by 62 nautical miles, and for a snorkeling time of 20 minutes the snorkeling time can be reduced by 11 seconds. No prototype was built for testing in this thesis, so the calculations may contain errors, since we have not been able to verify them. Nevertheless, we believe they give a good indication of what is affected and how it is affected. This makes it easier for us to conclude that what we believe is correct. By evaluating the two aspects of cost and strategic capacity, we concluded that the answer to the research question is that a thermoelectric installation on the existing heat exchanger is not worthwhile. The reason is that an installation does not provide a sufficiently high military strategic effect. On the contrary, the results show that the benefits of the installation are negligible.
The thesis also gives a brief account of the possibilities for a new heat exchanger configuration on the future submarines. These results show that it is possible to generate up to 10 times as much power, which gives an increase in range per patrol corresponding to the length of the Norwegian coast, while the snorkeling time is reduced by 1 minute and 30 seconds; in addition, the thermal signature is reduced. We therefore recommend further study of the potential for installing a TEG on the new Norwegian submarines.

    Effects of mRNA amplification on gene expression ratios in cDNA experiments estimated by analysis of variance

    BACKGROUND: A limiting factor of cDNA microarray technology is the need for a substantial amount of RNA per labeling reaction. Thus, 20–200 micrograms of total RNA or 0.5–2 micrograms of poly(A) RNA is typically required for monitoring gene expression. In addition, gene expression profiles from large, heterogeneous cell populations provide complex patterns from which biological data for the target cells may be difficult to extract. In this study, we chose to investigate a widely used mRNA amplification protocol that allows gene expression studies to be performed on samples with limited starting material. We present a quantitative study of the variation and noise present in our data set obtained from experiments with either amplified or non-amplified material. RESULTS: Using analysis of variance (ANOVA) and multiple hypothesis testing, we estimated the impact of amplification on the preservation of gene expression ratios. Both methods showed that the gene expression ratios were not completely preserved between amplified and non-amplified material. For each gene, we also compared the expression ratios between the two cell lines for the amplified material with those for the non-amplified material. With the aid of multiple t-testing with a false discovery rate of 5%, we found that 10% of the genes investigated showed significantly different expression ratios. CONCLUSION: Although the ratios were not fully preserved, amplification may prove to be extremely useful with respect to characterizing low expressing genes.
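The abstract above mentions multiple t-testing at a false discovery rate of 5% but does not say which procedure was used. As an illustration only, the sketch below assumes the Benjamini-Hochberg step-up procedure, a common way to control the false discovery rate; the function name `benjamini_hochberg` and the example p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return one boolean per p-value: True if that test is rejected.

    Assumption: Benjamini-Hochberg step-up procedure; the abstract does
    not name the FDR method, so this is illustrative only.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= (k / m) * fdr.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            max_rank = rank

    # Reject every hypothesis whose rank is at most max_rank.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Hypothetical per-gene p-values (made up for illustration).
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
flags = benjamini_hochberg(example_p, fdr=0.05)
print(sum(flags), "of", len(flags), "tests rejected")
```

In a per-gene analysis, each p-value would come from one gene's t-test, and the share of rejected tests would correspond to the kind of percentage reported in the abstract.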

    Biomarker profiling beyond amyloid and tau: cerebrospinal fluid markers, hippocampal atrophy, and memory change in cognitively unimpaired older adults

    Brain changes occurring in aging can be indexed by biomarkers. We used cluster analysis to identify subgroups of cognitively unimpaired individuals (n = 99, 64–93 years) with different profiles of the cerebrospinal fluid biomarkers beta amyloid 1–42 (Aβ42), phosphorylated tau (P-tau), total tau, chitinase-3-like protein 1 (YKL-40), fatty acid binding protein 3 (FABP3), and neurofilament light (NFL). Hippocampal volume and memory were assessed across multiple follow-up examinations covering up to 6.8 years. Clustering revealed one group (39%) with more pathological concentrations of all biomarkers, which could further be divided into one group (20%) characterized by tauopathy and high FABP3 and one (19%) by brain β-amyloidosis, high NFL, and slightly higher YKL-40. The clustering approach clearly outperformed classification based on Aβ42 and P-tau alone in prediction of memory decline, with the individuals with most tauopathy and FABP3 showing more memory decline, but not more hippocampal volume change. The results demonstrate that older adults can be classified based on biomarkers beyond amyloid and tau, with improved prediction of memory decline.